Tableau Story Analyzing Performance of Baseball Players

Introduction

This data set contains information about 1,157 baseball players, including their handedness (right- or left-handed), height (in inches), weight (in pounds), batting average, and home runs. I explore this dataset by creating visualizations in Tableau and attempt to analyze how these measures relate to the performance of baseball players.

Initial Visualization: https://public.tableau.com/profile/priyanka.swadi#!/vizhome/UdacityInitialProjectBaseball/Story1?publish=yes

Final Visualization: https://public.tableau.com/profile/priyanka.swadi#!/vizhome/UdacityFInalProjectBaseball/Story1?publish=yes

Summary

A variety of Tableau visualizations were used to represent data from the baseball players dataset. Relationships between height, weight, and handedness on one side and batting average and home runs on the other were explored. To better understand the combined effect of height and weight, a new parameter, BMI (Body Mass Index), was also added and its relationship to performance was explored.
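The write-up does not state how BMI was calculated. As a minimal sketch, assuming the standard imperial formula (703 × weight in pounds ÷ height in inches squared), the derived value could be computed as follows. The function name and the example player are hypothetical and are not taken from the dataset or the Tableau workbook.

```python
def bmi_from_imperial(weight_lb: float, height_in: float) -> float:
    """Body Mass Index from weight in pounds and height in inches.

    Assumption: the standard imperial formula, 703 * lb / in^2.
    The original write-up does not say which formula was used.
    """
    if height_in <= 0:
        raise ValueError("height_in must be positive")
    return 703.0 * weight_lb / (height_in ** 2)


# Hypothetical player (not from the dataset): 180 lb, 72 in tall.
example_bmi = bmi_from_imperial(180, 72)
print(round(example_bmi, 1))  # prints 24.4
```

In Tableau, the equivalent would typically be a calculated field built from the weight and height columns.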

Design

Initial Visualization:

The design decisions for the visualizations used in this Tableau story are explained below:

1. Single Variable Distributions:

The best way to visualize the range of values taken by height, weight, and BMI was to plot histograms. Since these are continuous variables in the dataset, they were grouped into bins before plotting. The bin size was adjusted slightly so that each histogram showed as much information as possible. The distributions turned out to be approximately normal, as expected.
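The binning step described above can be sketched outside Tableau. This is an illustration only, assuming fixed-width bins labelled by their lower edge. The function name, the bin width, and the sample heights are hypothetical and do not come from the dataset.

```python
from collections import Counter


def bin_counts(values, bin_width):
    """Count how many values fall into each fixed-width bin.

    Each bin is labelled by its lower edge, so with bin_width=2
    the values 70 and 71 both land in the bin labelled 70.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical heights in inches (made up for illustration).
heights = [67, 68, 70, 71, 71, 72, 73, 73, 74, 76]
print(bin_counts(heights, 2))
```

Changing `bin_width` has the same effect as adjusting the bin size in Tableau: wider bins smooth the shape, narrower bins show more detail.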

2. Scatterplot for Home Runs and Batting Average:

I wanted to see how two of the measures, Batting Average and Home Runs, were related to each other. The continuous variables for Home Runs (HR) and Batting Average were plotted as a scatterplot with HR on the Y axis and Batting Average on the X axis. Handedness was also represented using color, with the Color Blind palette.

3. Relationship between Handedness and Performance:

While the scatterplot above gave an insight into the range of values for Home Runs and Batting Average, I used simple horizontal bars to get an overview of how handedness affected Home Runs and Batting Average. This was also the plot Tableau suggested automatically for these measures. I used the Average aggregation for both Home Runs and Batting Average and again used the Color Blind palette to represent them. I also created a bar plot showing how many left-, right-, and both-handed players exist in the dataset. For these visualizations, Handedness was used as a dimension and the performance measures were used as continuous variables.
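The per-group averages behind the bar charts can be sketched in a few lines. This is a rough illustration, not the Tableau calculation itself. The column names (`handedness`, `avg`, `HR`) are assumptions about the dataset, and the rows are invented for the example, so the numbers say nothing about the real players.

```python
from collections import defaultdict


def mean_by_group(rows, group_key, value_key):
    """Return the mean of value_key for each distinct group_key."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[group_key]] += row[value_key]
        counts[row[group_key]] += 1
    return {group: totals[group] / counts[group] for group in totals}


# Hypothetical rows; column names are assumed, values are made up.
players = [
    {"handedness": "L", "avg": 0.270, "HR": 20},
    {"handedness": "L", "avg": 0.250, "HR": 10},
    {"handedness": "R", "avg": 0.240, "HR": 12},
    {"handedness": "R", "avg": 0.260, "HR": 8},
    {"handedness": "B", "avg": 0.255, "HR": 5},
]

print(mean_by_group(players, "handedness", "HR"))
print(mean_by_group(players, "handedness", "avg"))
```

Grouping by the dimension and averaging the measure mirrors using Handedness as a dimension with the Average aggregation on each measure.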

4. Scatterplots with Size for relationship between Height, Weight, BMI and performance:

X Axis: Height/Weight/BMI; Y Axis: Batting Average; Size: Average Home Runs

To depict both Batting Average and Home Runs in the same visualization, size seemed like a good encoding for Home Runs. The bigger and higher the circles are on the plot, the better the performance they indicate.

Final Visualization:

Most major features of the design remained the same. However, some smaller changes were made to improve the visualizations:

- The color encoding for Handedness was removed from the scatterplot of Home Runs vs. Batting Average, as it was not showing any useful information and the overlapping colored points made handedness hard to read.

- Instead of the L, R and B values for Handedness from the dataset, the labels Left, Right and Both were added as aliases to make the visualizations more readable.

- The y-axis labels of the histograms were changed from the default "count of (quantity)" labels to "number of players", which better describes what the quantity stands for.

Feedback

Feedback 1

Feedback 2

Feedback 3

Two separate reviewers reported difficulty interpreting R, L and B in the Handedness plots. They also noted that scrolling down to see the handedness counts was not very convenient. The first reviewer additionally pointed out that the Handedness information in the scatterplot was unnecessary for that visualization and made the data points harder to see.

The third reviewer suggested changing the y-axis labels of the histograms to number/count of players instead of the default generated label I had initially used.

I agreed with the suggestions made by all reviewers and included those in the final visualization.

Conclusion

The baseball dataset was analyzed using Tableau, and various visualizations were created to show the relationships between Height, Weight and Handedness and the performance of players, measured by Batting Average and Home Runs. A BMI parameter was added to create more effective visualizations.

Left-handed players had better Home Runs and Batting Average figures than right-handed players, while players using both hands had a slightly better Batting Average.

Taller players did not perform better; in fact, performance peaked at a height of 67 inches. Similarly, performance seemed to decrease with weight, although the heavier players seemed to hit more home runs.

The best performing players had a BMI greater than 28.8.